NSF PAR Search | NSF Public Access Repository


View-Invariant Policy Learning via Zero-Shot Novel View Synthesis

Tian, Stephen; Wulfe, Blake; Sargent, Kyle; Liu, Katherine; Zakharov, Sergey; Guizilini, Vitor; Wu, Jiajun (November 2024, Conference on Robot Learning (CoRL))

Unsupervised Discovery and Composition of Object Light Fields

Smith, Cameron; Yu, Hong-Xing; Zakharov, Sergey; Durand, Frédo; Tenenbaum, Joshua B.; Wu, Jiajun; Sitzmann, Vincent (June 2023, Transactions on Machine Learning Research)
Larochelle, Hugo; Kamath, Gautam; Hadsell, Raia; Cho, Kyunghyun (Ed.)
Neural scene representations, both continuous and discrete, have recently emerged as a powerful new paradigm for 3D scene understanding. Recent efforts have tackled unsupervised discovery of object-centric neural scene representations. However, the high cost of ray-marching, exacerbated by the fact that each object representation has to be ray-marched separately, leads to insufficiently sampled radiance fields and thus to noisy renderings, poor framerates, and high memory and time complexity during training and rendering. Here, we propose to represent objects in an object-centric, compositional scene representation as light fields. We propose a novel light field compositor module that enables reconstructing the global light field from a set of object-centric light fields. Dubbed Compositional Object Light Fields (COLF), our method enables unsupervised learning of object-centric neural scene representations, state-of-the-art reconstruction and novel view synthesis performance on standard datasets, and rendering and training speeds orders of magnitude faster than existing 3D approaches.
Multi-Object Manipulation via Object-Centric Neural Scattering Functions

https://doi.org/10.1109/CVPR52729.2023.00871

Tian, Stephen; Cai, Yancheng; Yu, Hong-Xing; Zakharov, Sergey; Liu, Katherine; Gaidon, Adrien; Li, Yunzhu; Wu, Jiajun (June 2023, IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR))

Neural Groundplans: Persistent Neural Scene Representations from a Single Image

Sharma, Prafull; Tewari, Ayush; Du, Yilun; Zakharov, Sergey; Ambrus, Rares Andrei; Gaidon, Adrien; Freeman, William T.; Durand, Frédo; Tenenbaum, Joshua B.; Sitzmann, Vincent (February 2023, International Conference on Learning Representations)

We present a method to map 2D image observations of a scene to a persistent 3D scene representation, enabling novel view synthesis and disentangled representation of the movable and immovable components of the scene. Motivated by the bird’s-eye-view (BEV) representation commonly used in vision and robotics, we propose conditional neural groundplans, ground-aligned 2D feature grids, as persistent and memory-efficient scene representations. Our method is trained self-supervised from unlabeled multi-view observations using differentiable rendering, and learns to complete geometry and appearance of occluded regions. In addition, we show that we can leverage multi-view videos at training time to learn to separately reconstruct static and movable components of the scene from a single image at test time. The ability to separately reconstruct movable objects enables a variety of downstream tasks using simple heuristics, such as extraction of object-centric 3D representations, novel view synthesis, instance-level segmentation, 3D bounding box prediction, and scene editing. This highlights the value of neural groundplans as a backbone for efficient 3D scene understanding models.
